Realistic Evaluation of Transductive Few-Shot Learning - Supplementary Material

Neural Information Processing Systems

In the main tables of the paper, we did not include the performance of α-TIM in the standard balanced setting. We want to emphasize that the model differences mentioned above can be straightforwardly applied to our α-TIM (and likely the other methods) to boost the results, at the cost of a significant increase in compute requirements. We provide the derivation of Eq. (4) in the main paper, which links the α-entropy H_α(p) to the α-divergence. The study in [4] examined the effect of class imbalance on the support set after defining several processes to generate class-imbalanced support sets.
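The α-entropy here is presumably the Tsallis entropy; under that assumption (whether Eq. (4) uses exactly this convention is not confirmed by this excerpt), the standard definitions over K classes, and the resulting affine link between the entropy and the divergence to the uniform distribution u, read:

```latex
H_\alpha(p) = \frac{1}{\alpha - 1}\Bigl(1 - \sum_{k=1}^{K} p_k^{\alpha}\Bigr),
\qquad
D_\alpha(p \,\|\, q) = \frac{1}{\alpha - 1}\Bigl(\sum_{k=1}^{K} p_k^{\alpha}\, q_k^{1-\alpha} - 1\Bigr),

% with q = u the uniform distribution (q_k = 1/K), substituting
% \sum_k p_k^{\alpha} = 1 - (\alpha - 1) H_\alpha(p) gives
D_\alpha(p \,\|\, u) = \frac{K^{\alpha-1} - 1}{\alpha - 1} \;-\; K^{\alpha-1}\, H_\alpha(p).
```

So maximizing the α-entropy is, up to constants, equivalent to minimizing the α-divergence to the uniform distribution, which is the kind of link the derivation refers to.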


A Appendix for Details of Deriving HTGM - A.1 The lower-bound of the likelihood function

Neural Information Processing Systems

In this section, we provide the details of the lower-bound in Eq. (3). This completes the derivation of Eq. (3). In other words, there is no overlap between any pair of balls. The training algorithm of HTGM is summarized in Algorithm 1. As we discussed in Sec. 2, to the best of our knowledge, our proposed method HTGM is the first. B.3 Discussion about the related multi-task learning methods: in an MTL method, all tasks are known a priori, i.e., the testing tasks are known in advance. The second difference lies in the generative process.
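The lower bound itself is not reproduced in this excerpt. Bounds of this kind typically follow the standard variational argument via Jensen's inequality; a generic sketch (not the paper's exact Eq. (3)) for a latent-variable model p_θ(x, z) and any distribution q(z):

```latex
\log p_\theta(x)
= \log \mathbb{E}_{q(z)}\!\left[\frac{p_\theta(x, z)}{q(z)}\right]
\;\ge\; \mathbb{E}_{q(z)}\bigl[\log p_\theta(x, z) - \log q(z)\bigr],
```

with equality when q(z) = p_θ(z | x).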




Cooperative Multi-agent Approach for Automated Computer Game Testing

Shirzadeh-hajimahmood, Samira, Prasetya, I. S. W. B., Dastani, Mehdi, Dignum, Frank

arXiv.org Artificial Intelligence

Automated testing of computer games is a challenging problem, especially when lengthy scenarios have to be tested. Automating such a scenario boils down to finding the right sequence of interactions given an abstract description of the scenario. Recent works have shown that an agent-based approach works well for this purpose, e.g., thanks to agents' reactivity, which enables a test agent to immediately react to game events and a changing state. Many games nowadays are multi-player. This opens up an interesting possibility: deploying multiple cooperative test agents to test such a game, for example to speed up the execution of multiple testing tasks. This paper offers a cooperative multi-agent testing approach and a study of its performance based on a case study of a 3D game called Lab Recruits.
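One way to picture "speeding up the execution of multiple testing tasks" with cooperative agents is a scheduling problem: split the scenario's tasks across agents so that the slowest agent finishes as early as possible. A minimal sketch of a greedy longest-task-first assignment; the task names and the scheduling policy are illustrative, not the allocation mechanism used in the paper:

```python
import heapq

def assign_tasks(tasks, n_agents):
    """Greedily assign testing tasks (name, estimated_duration) to agents:
    sort tasks longest-first, then always give the next task to the
    currently least-loaded agent (a classic makespan heuristic)."""
    # Min-heap of (current_load, agent_id), so heappop yields the least-loaded agent.
    heap = [(0.0, i) for i in range(n_agents)]
    heapq.heapify(heap)
    assignment = {i: [] for i in range(n_agents)}
    for name, duration in sorted(tasks, key=lambda t: -t[1]):
        load, agent = heapq.heappop(heap)
        assignment[agent].append(name)
        heapq.heappush(heap, (load + duration, agent))
    return assignment
```

For example, three tasks of durations 5, 3, and 1 split over two agents end up as {5} and {3, 1}, halving the makespan relative to a single test agent.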


Learning Task Embeddings for Teamwork Adaptation in Multi-Agent Reinforcement Learning

Schäfer, Lukas, Christianos, Filippos, Storkey, Amos, Albrecht, Stefano V.

arXiv.org Artificial Intelligence

Successful deployment of multi-agent reinforcement learning often requires agents to adapt their behaviour. In this work, we discuss the problem of teamwork adaptation, in which a team of agents needs to adapt their policies to solve novel tasks with limited fine-tuning. Motivated by the intuition that agents need to be able to identify and distinguish tasks in order to adapt their behaviour to the current task, we propose to learn multi-agent task embeddings (MATE). These task embeddings are trained using an encoder-decoder architecture optimised for reconstruction of the transition and reward functions, which uniquely identify tasks. We show that a team of agents is able to adapt to novel tasks when provided with task embeddings. We propose three MATE training paradigms: independent MATE, centralised MATE, and mixed MATE, which vary in the information used for the task encoding. We show that the embeddings learned by MATE identify tasks and provide useful information which agents leverage during adaptation to novel tasks.
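The core idea (encode observed transitions into a task embedding, then train by reconstructing next states and rewards) can be sketched with plain linear maps. All dimensions, the mean-pooling encoder, and the single-layer decoder are hypothetical simplifications for illustration, not MATE's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (hypothetical, not from the paper)
S_DIM, A_DIM, Z_DIM = 4, 2, 3
IN_DIM = S_DIM + A_DIM + 1 + S_DIM            # one (s, a, r, s') transition
W_enc = rng.normal(size=(IN_DIM, Z_DIM)) * 0.1
W_dec = rng.normal(size=(Z_DIM + S_DIM + A_DIM, S_DIM + 1)) * 0.1

def encode_task(transitions):
    """Encode a batch of (s, a, r, s') arrays into one task embedding
    by mean-pooling per-transition encodings."""
    x = np.concatenate(transitions, axis=1)   # (N, IN_DIM)
    return np.tanh(x @ W_enc).mean(axis=0)    # (Z_DIM,)

def decode(z, s, a):
    """Predict (s', r) from the task embedding and a state-action pair,
    mirroring the reconstruction targets: the transition and reward
    functions that uniquely identify a task."""
    x = np.concatenate([np.broadcast_to(z, (len(s), Z_DIM)), s, a], axis=1)
    out = x @ W_dec
    return out[:, :S_DIM], out[:, S_DIM]      # predicted s', predicted r

def reconstruction_loss(transitions):
    """Squared error on next states and rewards; minimising this over
    W_enc and W_dec would train the embedding."""
    s, a, r, s_next = transitions
    z = encode_task(transitions)
    s_pred, r_pred = decode(z, s, a)
    return float(((s_pred - s_next) ** 2).mean() + ((r_pred - r[:, 0]) ** 2).mean())
```

The independent/centralised/mixed variants would then differ in whether each agent encodes only its own transitions or a shared, team-wide batch.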


Hyper-Decision Transformer for Efficient Online Policy Adaptation

Xu, Mengdi, Lu, Yuchen, Shen, Yikang, Zhang, Shun, Zhao, Ding, Gan, Chuang

arXiv.org Artificial Intelligence

Decision Transformers (DT) have demonstrated strong performance in offline reinforcement learning settings, but quickly adapting to unseen novel tasks remains challenging. To address this challenge, we propose a new framework, called Hyper-Decision Transformer (HDT), that can generalize to novel tasks from a handful of demonstrations in a data- and parameter-efficient manner. To achieve such a goal, we propose to augment the base DT with an adaptation module, whose parameters are initialized by a hyper-network. When encountering unseen tasks, the hyper-network takes a handful of demonstrations as inputs and initializes the adaptation module accordingly. This initialization enables HDT to efficiently adapt to novel tasks by only fine-tuning the adaptation module. We validate HDT's generalization capability on object manipulation tasks. We find that with a single expert demonstration and fine-tuning only 0.5% of DT parameters, HDT adapts faster to unseen tasks than fine-tuning the whole DT model. Finally, we explore a more challenging setting where expert actions are not available, and we show that HDT outperforms state-of-the-art baselines in terms of task success rates by a large margin. Demos are available on our project page. Building an autonomous agent capable of generalizing to novel tasks has been a longstanding goal of artificial intelligence. Recently, large transformer models have shown strong generalization capability on language understanding when fine-tuned with limited data (Brown et al., 2020; Wei et al., 2021). Such success motivates researchers to apply transformer models to the regime of offline reinforcement learning (RL) (Chen et al., 2021; Janner et al., 2021).
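The mechanism being described (a frozen base model, a small residual adapter, and a hyper-network that maps a demonstration to the adapter's initial weights) can be sketched as follows. The sizes, the bottleneck-adapter shape, and the single-vector demonstration summary are assumptions for illustration; in a real DT the frozen base would dwarf the adapter far more than here (hence the 0.5% figure):

```python
import numpy as np

rng = np.random.default_rng(0)

D_DEMO, D_HID, D_ADAPT = 8, 16, 4   # hypothetical sizes

# Frozen weights: the base layer and the hyper-network itself.
W_base = rng.normal(size=(D_HID, D_HID)) * 0.1
W_hyper = rng.normal(size=(D_DEMO, D_HID * D_ADAPT + D_ADAPT * D_HID)) * 0.01

def init_adapter(demo):
    """Hyper-network: map a demonstration summary vector to the initial
    parameters of a bottleneck adapter (down- and up-projection)."""
    flat = demo @ W_hyper
    w_down = flat[: D_HID * D_ADAPT].reshape(D_HID, D_ADAPT)
    w_up = flat[D_HID * D_ADAPT :].reshape(D_ADAPT, D_HID)
    return w_down, w_up

def forward(h, adapter):
    """Base layer plus a residual adapter branch; at adaptation time only
    the adapter weights would receive gradient updates."""
    w_down, w_up = adapter
    base = np.tanh(h @ W_base)
    return base + (base @ w_down) @ w_up

def adapter_param_fraction(adapter):
    """Fraction of parameters that are fine-tuned (adapter vs. total)."""
    n_adapter = sum(w.size for w in adapter)
    return n_adapter / (W_base.size + n_adapter)
```

A new task is then handled by `adapter = init_adapter(demo_summary)` followed by a few gradient steps on the adapter alone.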


A Simple Approach for General Task-Oriented Picking using Placing constraints

Wang, Jen-Wei, Sun, Lingfeng, Zhu, Xinghao, Qian, Qiyang, Tomizuka, Masayoshi

arXiv.org Artificial Intelligence

Pick-and-place is an important manipulation task in domestic and manufacturing applications. Many works focus on grasp detection with high picking success rates but give little consideration to downstream manipulation tasks (e.g., placing). Although some research works have proposed methods to incorporate task conditions into grasp selection, most of them are data-driven and are therefore hard to adapt to arbitrary operating environments. Observing this challenge, we propose a general task-oriented pick-place framework that folds the target task and operating environment into grasp optimization as placing constraints. Combined with existing grasp detectors, our framework is able to generate feasible grasps for different downstream tasks and adapt to environmental changes without time-consuming re-training processes. Moreover, the framework can accept different definitions of placing constraints, so it is easy to integrate with other modules. Experiments in simulation and the real world on multiple pick-place tasks are conducted to evaluate the performance of our framework. Results show that our framework achieves high and robust task success rates on a wide variety of pick-place tasks.
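The overall pattern (take candidate grasps from an off-the-shelf detector, reject those that violate the placing constraint, and keep the best of the rest) can be sketched in a few lines. The function names and the representation of grasps are hypothetical; the paper's actual constraint formulation and optimization are richer than this filter-then-rank sketch:

```python
def select_grasp(candidates, placing_feasible, quality):
    """Pick the highest-quality grasp among those satisfying the placing
    constraint. `candidates` is any iterable of grasp hypotheses;
    `placing_feasible(g)` and `quality(g)` are caller-supplied callables,
    mirroring the idea that task and environment constraints plug into
    grasp selection on top of an existing grasp detector. Returns None
    when no candidate can satisfy the downstream placement."""
    feasible = [g for g in candidates if placing_feasible(g)]
    if not feasible:
        return None
    return max(feasible, key=quality)
```

Because the constraint is an arbitrary predicate rather than a learned model, swapping in a new task or environment means swapping the predicate, with no re-training.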